Abstract

We review recent results on the appearance of long-term persistence in climatic 
records and their relevance for the evaluation of global climate models and rare 
events. The persistence can be characterized, for example, by the correlation C(s) 
of temperature variations separated by s days. We show that, contrary to previous
expectations, C(s) decays for large s as a power law, $C(s) \sim s^{-\gamma}$. For continental
stations, the exponent $\gamma$ is always close to 0.7, while for stations on islands
$\gamma \approx 0.4$. In contrast to the temperature fluctuations, the fluctuations of the rainfall
usually cannot be characterized by long-term power-law correlations but rather by
pronounced short-term correlations. The universal persistence law for the temperature
fluctuations on continental stations represents an ideal (and uncomfortable)
test-bed for the state-of-the-art global climate models and allows us to evaluate
their performance. In addition, the presence of long-term correlations leads to a 
novel approach for evaluating the statistics of rare events. 

1. INTRODUCTION 

The persistence of weather states on short terms is a well-known phenomenon: a warm day 
is more likely to be followed by a warm day than by a cold day and vice versa. The trivial 
forecast that the weather of tomorrow is the same as the weather of today was, in previous 
times, often used as a "minimum skill" forecast for assessing the usefulness of short-term 
weather forecasts. The typical time scale for weather changes is about one week, a time 
period which corresponds to the average duration of so-called "general weather regimes"
or "Grosswetterlagen", so this type of short-term persistence usually stops after about
one week. On larger scales, other types of persistence occur; one of them is related to
circulation patterns associated with blocking [5]. A blocking situation occurs when a very
stable high-pressure system is established over a particular region and remains in place for 
several weeks. As a result, the weather in the region of the high remains fairly persistent 
throughout this period. Furthermore, transient low-pressure systems are deflected around 
the blocking high so that the region downstream of the high experiences a larger than 
usual number of storms. On even longer terms, a source for weather persistence might be
slowly varying external (boundary) forcing, such as sea surface temperatures and anomaly
patterns. On the scale of months to seasons, one of the most pronounced
phenomena is the El Niño Southern Oscillation (ENSO) event, which occurs every three
to five years and which strongly affects the weather over the tropical Pacific as well as
over North America [12, 11].

The question is how the persistence, which may be generated by very different mechanisms
on different time scales, decays with time s. The answer to this question is not
simple. Correlations, and in particular long-term correlations, can be masked by trends 
that are generated, for instance, by the well-known urban warming phenomenon. Even 
uncorrelated data in the presence of long-term trends may look like correlated data, and, 
on the other hand, long-term correlated data may look like uncorrelated data influenced 
by a trend. 

Therefore, in order to distinguish between trends and correlations, one needs methods 
that can systematically eliminate trends. Those methods are available now: both wavelet 
techniques (WT) 1 and detrended fluctuation analysis (DFA) 2 can systematically eliminate 
trends in the data and thus reveal intrinsic dynamical properties such as distributions, 
scaling and long-range correlations that are often masked by nonstationarities. 

In recent studies we have used DFA and WT to study temperature and
precipitation correlations in different climatic zones on the globe. The results indicate 
that the temperature variations are long-range power-law correlated above some crossover 
time that is of the order of 10 days. Above 10 d, the persistence, characterized by the
autocorrelation C(s) of temperature variations separated by s days, decays as

$$C(s) \sim s^{-\gamma}, \tag{1}$$

where, most interestingly, the exponent $\gamma$ has roughly the same value $\gamma \approx 0.7$ for all
continental records. For small islands the correlations are more pronounced, with $\gamma$ around
0.4. This value is close to the value obtained recently for correlations of sea-surface
temperatures. In marked contrast, for most stations the precipitation records do
not show indications of long-range temporal correlations on scales above 6 months. Our
results are supported by independent analyses by several groups.

The fact that the correlation exponent varies only very little for the continental atmospheric
temperatures presents an ideal test-bed for the performance of the global climate
models, as we will show below. We present an analysis of the two standard scenarios
(greenhouse gas forcing only and greenhouse gas plus aerosols forcing) together with the
analysis of a control run. Our analysis points to clear deficiencies of the models. For further
discussions we refer to Govindan et al. Finally, we review a recent approach
to determine the statistics of rare events in the presence of long-term correlations.

The chapter is organized in five sections. In Section 2, we describe one of the detrending
analysis methods, the detrended fluctuation analysis (DFA). In Section 3, we
review the application of this method to both atmospheric temperature and precipitation 
records. In Section 4, we describe how the "universal" persistence law for the atmospheric 
temperature fluctuations on continental stations can be used to test the three scenarios 
of the state-of-the-art climate models. In Section 5, finally, we describe how the common 
extreme value statistics is modified in the presence of long-term correlations. 



2. THE METHODS OF ANALYSIS

Consider, for example, a record $T_i$, where the index i counts the days in the record,
i = 1, 2, ..., N. This record $T_i$ may represent the maximum daily temperature or the daily
amount of precipitation, measured at a certain meteorological station. To eliminate
the periodic seasonal trends, we concentrate on the departures $\Delta T_i = T_i - \overline{T}_i$ of the $T_i$
from their mean daily value $\overline{T}_i$ for each calendar date i, say 1st of April, which has been
obtained by averaging over all years in the record.

Quantitatively, correlations between two $\Delta T_i$ values separated by n days are defined
by the (auto)correlation function

$$C(n) = \langle \Delta T_i \, \Delta T_{i+n} \rangle = \frac{1}{N-n} \sum_{i=1}^{N-n} \Delta T_i \, \Delta T_{i+n}. \tag{2}$$

If the $\Delta T_i$ are uncorrelated, C(n) is zero for n positive. If correlations exist up to a certain
number of days $n_\times$, the correlation function will be positive up to $n_\times$ and vanish above
$n_\times$. A direct calculation of C(n) is hindered by the level of noise present in the finite
records, and by possible nonstationarities in the data.
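
The two definitions above (seasonal departures and the direct estimate of C(n)) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `seasonal_anomalies` and `autocorrelation` and the parameter `days_per_year` are hypothetical, and the record is assumed to be a gap-free daily series with leap days already removed, so that the index modulo 365 identifies the calendar date.

```python
import numpy as np

def seasonal_anomalies(temps, days_per_year=365):
    """Return departures of each value from the mean of its calendar date.

    Assumption: `temps` is a gap-free daily record without leap days, so
    that i % days_per_year identifies the calendar date of entry i.
    """
    temps = np.asarray(temps, dtype=float)
    day_of_year = np.arange(temps.size) % days_per_year
    # Mean value for each calendar date, averaged over all years in the record.
    sums = np.bincount(day_of_year, weights=temps, minlength=days_per_year)
    counts = np.bincount(day_of_year, minlength=days_per_year)
    calendar_mean = sums / np.maximum(counts, 1)
    return temps - calendar_mean[day_of_year]

def autocorrelation(anomalies, n):
    """Estimate C(n) by averaging the products over the N - n available pairs."""
    x = np.asarray(anomalies, dtype=float)
    if not 0 <= n < x.size:
        raise ValueError("lag n must satisfy 0 <= n < N")
    return float(np.mean(x[: x.size - n] * x[n:]))
```

For real records this direct estimate becomes very noisy at large n, which is the motivation for working with the profile instead.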

To reduce the noise we do not calculate C(n) directly, but instead study the "profile"

$$Y_m = \sum_{i=1}^{m} \Delta T_i. \tag{3}$$

We can consider the profile $Y_m$ as the position of a random walker on a linear chain
after m steps. The random walker starts at the origin and performs, in the ith step,
a jump of length $\Delta T_i$ to the right if $\Delta T_i$ is positive, and to the left if $\Delta T_i$ is negative.
The fluctuations $F^2(s)$ of the profile, in a given time window of size s, are related to the
correlation function C(s). For the relevant case (1) of long-term power-law correlations,
$C(s) \sim s^{-\gamma}$, $0 < \gamma < 1$, the mean-square fluctuations $F^2(s)$, obtained by averaging over
many time windows of size s (see below), asymptotically increase by a power law,

$$F^2(s) \sim s^{2\alpha}, \qquad \alpha = 1 - \gamma/2. \tag{3}$$

For uncorrelated data (as well as for correlations decaying faster than 1/s), we have
$\alpha = 1/2$.
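
The relation between the two exponents can be made plausible with a short calculation. The following is a sketch for the simplest case only (no detrending, stationary anomalies with zero mean); it is the standard random-walk argument and is not taken from the text itself.

```latex
% Mean-square displacement of the profile over s steps, for stationary
% anomalies with zero mean and autocorrelation C(n):
\begin{align}
F^2(s) &= \Big\langle \Big( \sum_{i=1}^{s} \Delta T_i \Big)^{2} \Big\rangle
        = \sum_{i=1}^{s} \sum_{j=1}^{s} C(|i-j|)
        = s\,C(0) + 2 \sum_{n=1}^{s-1} (s-n)\, C(n). \\
\intertext{Inserting $C(n) \propto n^{-\gamma}$ with $0 < \gamma < 1$ and
replacing the sum by an integral for large $s$ gives, to leading order,}
2 \int_{1}^{s} (s-n)\, n^{-\gamma}\, dn
       &\simeq \frac{2\, s^{2-\gamma}}{(1-\gamma)(2-\gamma)} .
\end{align}
% Because 2 - gamma > 1, this term dominates the linear term s C(0), so
% F^2(s) ~ s^{2 - gamma} = s^{2 alpha}, i.e. alpha = 1 - gamma/2.
% For gamma > 1 the sum over C(n) converges, F^2(s) grows linearly in s,
% and alpha = 1/2.
```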

For the analysis of the fluctuations, we employ a hierarchy of methods that differ in
the way the fluctuations are measured and possible trends are eliminated (for a detailed
description of the methods we refer to Kantelhardt et al.).

1) In the simplest type of fluctuation analysis (FA) (where trends are not
eliminated), we determine the difference of the profile at both ends of each window. The
square of this difference represents the square of the fluctuations in each window.

2) In the first-order detrended fluctuation analysis (DFA1), we determine in each
window the best linear fit of the profile. The variance of the profile from this straight line
represents the square of the fluctuations in each window.

3) In general, in the nth-order DFA (DFAn), we determine in each window the best nth-order
polynomial fit of the profile. The variance of the profile from these best nth-order
polynomials represents the square of the fluctuations in each window.

By definition, FA does not eliminate trends, similar to the Hurst method and the
conventional power spectral methods [7]. In contrast, DFAn eliminates trends of order
n in the profile and n - 1 in the original time series. Thus, from the comparison of
fluctuation functions F(s) obtained from different methods, one can learn about long-term
correlations and types of trends, which cannot be achieved by the conventional
techniques.
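
The hierarchy FA, DFA1, ..., DFAn described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `fluctuation_function` and `scaling_exponent` are hypothetical, the windows are taken as non-overlapping segments counted from the start of the record (leftover points at the end are dropped), and the exponent is estimated by a plain least-squares fit in the log-log plane over whatever scales the caller supplies.

```python
import numpy as np

def fluctuation_function(anomalies, scales, order=1):
    """Return F(s) for each window size s (order=0: FA, order=n>=1: DFAn)."""
    profile = np.cumsum(np.asarray(anomalies, dtype=float))  # the profile Y_m
    result = []
    for s in scales:
        if s < order + 2 or s > profile.size:
            raise ValueError("need order + 2 <= s <= record length")
        n_windows = profile.size // s
        # Non-overlapping windows of size s; leftover points are dropped.
        windows = profile[: n_windows * s].reshape(n_windows, s)
        if order == 0:
            # FA: squared difference of the profile at both ends of each window.
            f2 = (windows[:, -1] - windows[:, 0]) ** 2
        else:
            # DFAn: variance around the best polynomial of degree `order`,
            # fitted in every window at once by linear least squares.
            t = np.linspace(-1.0, 1.0, s)  # rescaled time for conditioning
            basis = np.vander(t, order + 1)
            coeffs, *_ = np.linalg.lstsq(basis, windows.T, rcond=None)
            residuals = windows - (basis @ coeffs).T
            f2 = np.mean(residuals ** 2, axis=1)
        result.append(np.sqrt(np.mean(f2)))
    return np.array(result)

def scaling_exponent(scales, fluctuations):
    """Slope of log F(s) versus log s, i.e. an estimate of alpha."""
    slope, _ = np.polyfit(np.log(scales), np.log(fluctuations), 1)
    return float(slope)
```

Applied to uncorrelated Gaussian noise, the estimated slope should come out near 1/2; comparing the curves for several values of `order` on the same record is what allows trends to be told apart from correlations.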

Figure 1: Analysis of daily temperature records of four representative weather stations.
The four figures show the fluctuation functions obtained by FA, DFA1, DFA2, DFA3, 
DFA4, and DFA5 (from top to bottom) for the four sets of data. The scale of the 
fluctuation functions is arbitrary. In each panel, a line with slope 0.65 is shown as a guide 
to the eye. 

3. ANALYSIS OF TEMPERATURE AND PRECIPITATION RECORDS

Figure 1 shows the results of the FA and DFA analysis of the maximum daily temperatures
$T_i$ of the following weather stations (the length of the records is given in
parentheses): (a) Cheyenne (USA, 123 y), (b) Edinburgh (UK, 102 y), (c) Campbell
Island (New Zealand, 57 y), and (d) Sonnblick (Austria, 108 y). The results are typical
for a large number of records that we have analyzed so far (see Eichner et al. [6] and
Koscielny-Bunde et al. [16, 15]). Cheyenne has a continental climate, Edinburgh is on a
coastline, Campbell Island is a small island in the Pacific Ocean, and the weather station
of Sonnblick is on top of a mountain.

In the log-log plots, all curves are (except at small s values) approximately straight
lines. For both the stations inside the continents and along coastlines, the slope is
$\alpha \approx 0.65$. There exists a natural crossover (above the DFA crossover) that can be best
estimated from FA and DFA1. As can be verified easily, the crossover occurs roughly
at $t_c = 10$ d, which is the order of magnitude for a typical Grosswetterlage. Above $t_c$,
there exists long-range persistence expressed by the power-law decay of the correlation
function with an exponent $\gamma = 2 - 2\alpha \approx 0.7$. These results are representative for the large
number of records we have analyzed. They indicate that the exponent is "universal", i.e.,
does not depend on the location and the climatic zone of the weather station. Below $t_c$,
the fluctuation functions do not show universal behavior and reflect the different climatic
zones.

Figure 2: Analysis of daily precipitation records of four representative weather stations.
The four figures show the fluctuation functions obtained by FA, DFA1, DFA2, DFA3,
DFA4, and DFA5 (from top to bottom) for the four sets of data. The scale of the
fluctuation functions is arbitrary. In each panel, a line with slope 0.5 is shown as a guide
to the eye.

However, there are exceptions to the universal behavior, and these occur for locations
on small islands and on top of large mountains. In the first case, the exponent can
be considerably larger, $\alpha \approx 0.8$, corresponding to $\gamma \approx 0.4$. In the second case, on top of
a mountain, the exponent can be smaller, $\alpha \approx 0.58$, corresponding to $\gamma \approx 0.84$.
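
The correlation exponents quoted for the three station types follow from the measured fluctuation exponents through the relation between the two exponents given in the methods section. A small sketch of that arithmetic (the function name `correlation_exponent` is hypothetical):

```python
def correlation_exponent(alpha):
    """Convert a fluctuation exponent alpha into gamma = 2 - 2 * alpha."""
    if not 0.5 < alpha < 1.0:
        # The relation is only meaningful for long-term correlated data.
        raise ValueError("the relation assumes 0.5 < alpha < 1, i.e. 0 < gamma < 1")
    return 2.0 - 2.0 * alpha

for label, alpha in [("continental/coastal", 0.65),
                     ("small island", 0.80),
                     ("mountain top", 0.58)]:
    print(f"{label}: alpha = {alpha:.2f} -> gamma = {correlation_exponent(alpha):.2f}")
```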

Next we consider precipitation records. Figure 2 shows the results of the FA and DFA
analysis of the daily precipitation $P_i$ of the following weather stations (the length of the
records is given in parentheses): (a) Cheyenne (USA, 117 y), (b) Edinburgh
(UK, 102 y), (c) Campbell Island (New Zealand, 57 y), and (d) Sonnblick (Austria, 108
y). The results are typical and represent a large number of records that we have analyzed
so far [22].

In the log-log plots, all curves are (except at small s values) approximately straight 
lines at large times, with a slope close to 0.5. If there exist long-term correlations, then 
they are very small. Some exceptions are again stations on top of a mountain, where 
the exponent might be around 0.6, but this happens only very rarely. In most cases, the 
exponent is between 0.5 and 0.55, pointing to uncorrelated or weakly correlated behavior 
at large time spans. Unlike the temperature records, the exponents actually do not depend 
on specific climatic or geographic conditions. 

Figure 3 summarizes the results for exponents $\alpha$ for (a) temperature records and (b)
precipitation records. Different climatological conditions are marked in the histograms.
First we concentrate on the temperature records (fig. 3a). One can see clearly that for
stations that are neither on islands nor on summits, the average exponent is close to 0.65,
with a variance of 0.03. For the islands (where only few records are available) the average
value of $\alpha$ is 0.78, with quite a large variance of 0.08. The variance is large, since stations
on larger islands, like Wrangelija, behave more like continental stations, with an exponent
close to 0.65. For the precipitation records (fig. 3b), the average exponent $\alpha$ is close to
0.54, with a variance close to 0.05, and does not depend significantly on the climatic
conditions around a weather station.

Figure 3: Histograms of the values of the fluctuation exponents $\alpha$ (a) for daily temperature
records and (b) for daily precipitation records.

Since for the temperature records the exponent for continental and coastline stations 
does not depend on the location of the meteorological station and its local environment, 
the power-law behavior can serve as an ideal test for climate models where regional details 
cannot be incorporated and therefore regional phenomena like urban warming cannot be 
accounted for. The power-law behavior seems to be a global phenomenon and therefore 
should also show up in the simulated data of the global climate models (GCM). 



4. TEST OF GLOBAL CLIMATE MODELS

The state-of-the-art climate models that are used to estimate future climate are coupled
atmosphere-ocean general circulation models (AOGCMs). The models provide
numerical solutions of the Navier-Stokes equations devised for simulating meso-scale to
large-scale atmospheric and oceanic dynamics. In addition to the explicitly resolved scales
of motions, the models also contain parameterization schemes representing the so-called
subgrid-scale processes, such as radiative transfer, turbulent mixing, boundary layer processes,
cumulus convection, precipitation, and gravity wave drag. A radiative transfer
scheme, for example, is necessary for simulating the role of various greenhouse gases such 
as CO2 and the effect of aerosol particles. The differences among the models usually lie 
in the selection of the numerical methods employed, the choice of the spatial resolution 3 , 
and the subgrid-scale parameters. 

Three scenarios have been studied by the models, and the results are available, for
four models, from the IPCC Data Distribution Center. The first scenario represents
a control run where the CO2 content is kept fixed. In the second scenario, one considers
only the effect of greenhouse gas forcing (GHG). The amount of greenhouse gases is taken
from the observations until 1990 and then increased at a rate of 1% per year. In the
third scenario, the effect of aerosols (mainly sulfates) in the atmosphere is taken into
account. Only direct sulfate forcing is considered; until 1990, the sulfate concentrations
are taken from historical measurements, and are increased linearly afterwards. The effect
of sulfates is to mitigate and partially offset the greenhouse gas warming. Although
this scenario represents an important step towards comprehensive climate simulation, it
introduces new uncertainties regarding the distributions of natural and anthropogenic
aerosols and, in particular, regarding indirect effects on the radiation balance through
cloud-cover modification, etc. [11].

For the test, we consider the monthly temperature records from those four AOGCMs. 
Data for these three scenarios are available from the Internet: CSIRO-Mk2 (Melbourne), 
CCSR/NIES (Tokyo), ECHAM4/OPYC3 (Hamburg), and CGCM1 (Victoria, Canada). 
We extracted the data for six representative sites around the globe (Prague, Kasan, Seoul, 
Luling [Texas], Vancouver, and Melbourne). For each model and each of the three scenarios,
we selected the temperature records of the four grid points closest to each site, and
bilinearly interpolated the data to the location of the site. Figure 4 shows representative
results of the fluctuation functions, calculated using DFA3, for two sites (Kasan [Russia]
and Luling [Texas]) for the four models and the three scenarios. As seen in figure 4, most
of the DFA curves approach the slope of 0.5. However, the control runs seem to show a
somewhat better performance, i.e., many of them have a slope close to 0.65 (e.g., Luling
(CSIRO-Mk2)), and the greenhouse gas only scenario shows the worst performance. The
actual long-term exponents $\alpha$ for the three scenarios of the 4 models for the 6 cities are
summarized in figure 5(a)-(c). Each histogram consists of 24 blocks and every block is
specified by the model and the city.
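
The interpolation step mentioned above, from the four surrounding grid points to the site, can be sketched as follows. This is an illustration under assumptions: the function name `bilinear_interpolate` and its argument layout are hypothetical, the four points are taken to form a rectangle in longitude and latitude, and the site is assumed to lie inside that rectangle. The values may be single numbers or whole time series (for example NumPy arrays of monthly temperatures).

```python
def bilinear_interpolate(lon, lat, lon0, lon1, lat0, lat1, v00, v10, v01, v11):
    """Interpolate to (lon, lat) from the four corners of a grid cell.

    v00 is the value at (lon0, lat0), v10 at (lon1, lat0),
    v01 at (lon0, lat1), and v11 at (lon1, lat1).
    """
    tx = (lon - lon0) / (lon1 - lon0)  # fractional position in longitude
    ty = (lat - lat0) / (lat1 - lat0)  # fractional position in latitude
    bottom = (1 - tx) * v00 + tx * v10  # interpolate along the lat0 edge
    top = (1 - tx) * v01 + tx * v11     # interpolate along the lat1 edge
    return (1 - ty) * bottom + ty * top
```

At a corner the function returns the corner value, and at the cell centre it returns the mean of the four values.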

For the control run (fig. 5(a)) there is a peak at $\alpha = 0.65$, but more than half of
the exponents are below $\alpha = 0.62$. For the greenhouse gas only scenario (fig. 5(b)), the
histogram shows a pronounced maximum at $\alpha = 0.5$. For best performance, all models
should have exponents $\alpha$ close to 0.65, corresponding to a peak of height 24 in the window
between 0.62 and 0.68. Actually, more than half of the exponents are close to 0.5, while
only 3 exponents are in the proper window between 0.62 and 0.68. Figure 5(c) shows
the histogram for the greenhouse gas plus aerosol scenario, where, in addition to the
greenhouse gas forcing, the effects of aerosols are also taken into account. For this case,
there is a pronounced maximum in the $\alpha$ window between 0.56 and 0.62 (more than half
of the exponents are in this window), while again only 3 exponents are in the proper range
between 0.62 and 0.68. This shows that although the greenhouse gas plus aerosol scenario
is also far from reproducing the scaling behavior of the real data, its overall performance
is better than the performance of the greenhouse gas scenario. The best performance
is observed for the control run, which points to remarkable deficiencies in the way the
forcings are introduced into the models.

Figure 4: Comparison of the scaling performance of the three scenarios: control run,
greenhouse gas forcing only, and greenhouse gas plus aerosols. All curves are
obtained by applying DFA3 to the monthly mean of the daily maximum temperatures
generated by the four AOGCMs. The lines with slopes 0.65 and 0.5 are shown as a guide
to the eye. For details of the records, we refer to [13].

Figure 5: Histograms of the values of the fluctuation exponent $\alpha$ obtained from the
simulations of the four AOGCMs (listed in (a)), for six sites (listed in (b)). The three
panels are for the three scenarios: (a) control run, (b) greenhouse gas forcing only, and
(c) greenhouse gas plus aerosol forcing. The entries in each box represent "Model - Site".

5. EXTREME VALUE STATISTICS

The long-term correlations in a record $\{T_i\}$ strongly affect the statistics of the extreme
values in the record, as has been shown recently in [2]. The central quantity in Extreme
Value Statistics (EVS) is the return time $r_q$ between two events of size greater than or equal to
a certain threshold q. The basic assumption in conventional EVS is that the events are
uncorrelated (at least when the time lag between them is sufficiently large). In this case,
one can obtain the mean return time $R_q$ simply from the probability $W_q$ that an event
greater than or equal to q occurs, $R_q = 1/W_q$. Since the events are uncorrelated, the
return intervals are also uncorrelated and follow Poisson statistics; i.e., their distribution
function $P_q(r)$ is a simple exponential, $P_q(r) \sim \exp(-r/R_q)$.
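
The quantities just defined can be read off a record directly. The sketch below is illustrative only; the function names `return_intervals` and `expected_mean_return_time` are hypothetical, and the probability $W_q$ is simply estimated as the observed fraction of values at or above the threshold.

```python
import numpy as np

def return_intervals(record, q):
    """Return the time gaps r_q between successive events with value >= q."""
    event_times = np.flatnonzero(np.asarray(record) >= q)
    return np.diff(event_times)

def expected_mean_return_time(record, q):
    """Return R_q = 1 / W_q, with W_q the observed fraction of values >= q."""
    w_q = np.mean(np.asarray(record) >= q)
    if w_q == 0:
        raise ValueError("no event reaches the threshold q")
    return 1.0 / w_q
```

The resulting sequence of intervals is itself a record, so the fluctuation analysis of Section 2 can be applied to it in order to check whether successive return intervals are correlated.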

For long-term correlations, it has been shown in [2] that the distribution function $P_q(r)$
changes into a stretched exponential function,

$$P_q(r) \sim \exp\left[-\mathrm{const} \cdot (r/R_q)^{\gamma}\right] \tag{4}$$

for $\gamma$ between zero and one. For $\gamma$ above one, in the case of short-term correlations, $P_q$
reduces to the Poisson distribution.

In addition, the return intervals become long-term correlated, with an exponent that
is approximately identical to $\gamma$ [2]. This is seen in Fig. 6, where the fluctuation functions
F(s) of the return intervals (obtained by DFA2) are shown for two artificial long-term
correlated records with $\gamma = 0.4$ and 0.7. Two values of thresholds q = 1.5 and 2.5 have
been considered for each value of $\gamma$. The distribution of $T_i$ values has been chosen as
Gaussian, with zero mean and variance one. In the double logarithmic plot, all the curves
approach straight lines with slopes $\alpha = 1 - \gamma/2$, suggesting that the return intervals are
long-term correlated in the same way as the $T_i$.

Figure 6: Fluctuation functions F(s) for the record of return intervals obtained from two
artificial long-term correlated records $\{T_i\}$ with $\gamma = 0.4$ (upper curves) and $\gamma = 0.7$
(lower curves). The distribution of the $T_i$ values has been chosen as Gaussian, with zero
mean and variance one. For the return intervals, the thresholds q = 1.5 and 2.5 have
been considered. The straight lines in the figure have slopes $\alpha = 1 - \gamma/2$, suggesting
that the return intervals are long-term correlated in the same way as the $T_i$.

As a consequence, small return intervals are more likely to be followed by small inter- 
vals and large intervals are more likely to be followed by large intervals. Accordingly, for 
long-term correlated records it is more likely than for uncorrelated records that a sequence 
of large return times is followed by a sequence of short return times. 

This fact may be relevant for the occurrence of floods. It is well known that river
flows are long-term correlated with exponents $\gamma$ between 0.3 and 0.9, in most cases close
to 0.4. In the last decades, the frequency of large floods in Europe has increased. It is
possible that this increase is due to global warming, but it is also possible that it has
been triggered by the long-term correlations.